In [1]:
import numpy as np
# Using floats (doubles) is important later
xor = np.array([[0, 0, 0],
                [0, 1, 1],
                [1, 0, 1],
                [1, 1, 0]], dtype=float)
print(xor)
The input is the first two columns, and the target is the last column. Note that the shape of the data is important: each row is one example, and slicing the target with -1: (rather than -1) keeps it two-dimensional.
In [2]:
xor_in = xor[:, :2]
xor_target = xor[:, -1:]
print(xor_in)
print(xor_target)
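A quick sanity check of the shapes (plain NumPy, nothing package-specific): both arrays keep one row per example, and the target stays 2D.

print(xor_in.shape)      # (4, 2) -- one row per example, two input columns
print(xor_target.shape)  # (4, 1) -- still 2D thanks to the -1: slice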
There are different networks depending on the training method one wishes to use. GeneticNetwork uses a genetic algorithm and is the most flexible when it comes to error functions. RpropNetwork uses the iRprop+ method and is the fastest, but requires a differentiable error function.
Since I will use the mean square error, I'll select Rprop.
All types share the same constructor signature, taking the number of input, hidden and output neurons respectively:
net = NETWORKTYPE(input_count, hidden_count, output_count)
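For comparison, constructing the genetic-algorithm variant would look like the sketch below. The lowercase import name geneticnetwork is an assumption here, mirroring the rpropnetwork import used in the next cell.

# Sketch only -- assumes the genetic-algorithm class is exported as
# `geneticnetwork`, analogous to `rpropnetwork` used in this notebook.
from ann import geneticnetwork

ga_net = geneticnetwork(2, 4, 1)  # 2 inputs, 4 hidden neurons, 1 output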
In [3]:
from ann import rpropnetwork
# Create a network matching the data, with a couple of hidden neurons
net = rpropnetwork(xor_in.shape[1], 4, xor_target.shape[1])
# Total number of neurons (including the bias neuron, of which there is only one)
neuron_count = net.input_count + net.hidden_count + net.output_count + 1
# All zero connections at first
print("Default connections")
print(net.connections.reshape((neuron_count, neuron_count)))
# All weights are zero too
print("\nDefault weights")
print(net.weights.reshape((neuron_count, neuron_count)))
print("\nDefault activation functions are linear ({})".format(net.LINEAR))
print(net.activation_functions)
By default, the network has no connections between neurons. You can set both the weights and the connections directly to values of your liking, but a convenience method is supplied for creating feedforward networks.
The connections and weights are defined as NxN matrices of ints and doubles respectively. Activation functions can also be set at the individual-neuron level if desired, using the N-length vector activation_functions. To do anything interesting, make sure the hidden neurons use either tanh or logsig. These are set for you by the connect_feedforward method (see the sketch below for what a manual setup might look like).
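Purely as an illustration, a manual setup could look like this. It assumes connections, weights and activation_functions are writable in the same flat layout they are read in above, and that a net.TANH constant exists alongside net.LINEAR and net.LOGSIG; the neuron indices are placeholders, not the package's actual ordering.

# Sketch only -- attribute writability, net.TANH and the neuron ordering
# are assumptions; connect_feedforward below does all of this for you.
conn = net.connections.reshape((neuron_count, neuron_count)).copy()
w = net.weights.reshape((neuron_count, neuron_count)).copy()

conn[-1, 0] = 1    # hypothetical: wire input 0 into the last neuron
w[-1, 0] = 0.5     # with an initial weight of 0.5

net.connections = conn.ravel()
net.weights = w.ravel()

funcs = np.array(net.activation_functions)
funcs[-1] = net.LOGSIG   # logsig on that same neuron
net.activation_functions = funcs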
In [4]:
from ann import connect_feedforward
# Connect in a single hidden layer (default) with logsig functions on both
# hidden and outputs (also default)
connect_feedforward(net)
print("\n\nFeedforward connections")
print(net.connections.reshape((neuron_count, neuron_count)))
print("\nWeights have been randomized and normalized to suitable ranges")
print(net.weights.reshape((neuron_count, neuron_count)))
print("\nActivation functions are now changed to logsig ({})".format(net.LOGSIG))
print(net.activation_functions)
print("\nInputs and Bias have no activation functions, or connections to other neurons")
In [5]:
print("Error function:", net.error_function, net.ERROR_MSE)
print("Max training iterations:", net.maxEpochs)
print("Max error accepted for early stopping:", net.maxError)
print("Min change in error to consider training to be done:", net.minErrorFrac)
In [6]:
net.learn(xor_in, xor_target)
In [7]:
outputs = []
for x in xor_in:
    y = net.output(x)
    print("{:.0f} X {:.0f} = {:.1f}".format(x[0], x[1], y[0]))
    outputs.append(y)
outputs = np.array(outputs)
# Note that y is an array
y.shape == (net.output_count,)
Out[7]:
True
Mean square error is defined as:
$$ e = \frac{1}{N} \sum_i^N (\tau_i - y_i)^2 $$
The package, however, divides by two and neglects the N term, so that the derivative with respect to an output becomes simply $(\tau_i - y_i)$ instead of:
$$ \frac{de}{dy_i} = \frac{2}{N}(\tau_i - y_i) $$
This just removes some calculations from the process; it has no impact on training.
In [8]:
# Mean square error, computed by hand
e = np.sum((xor_target - outputs)**2) / len(xor_target)
print("MSE: {:.6f}".format(e))
# Can also ask the package to calculate it for us
from ann import get_error
e = get_error(net.ERROR_MSE, xor_target, outputs)
# The result is per-element and not summed for us, since training uses it piece by piece
print("MSE: {:.6f}".format(2 * e.sum()/len(xor_target)))